Fast approximation of matrix coherence and statistical leverage

نویسندگان

  • Michael W. Mahoney
  • Petros Drineas
  • Malik Magdon-Ismail
  • David P. Woodruff
چکیده

The statistical leverage scores of a matrix A are the squared row-norms of the matrix containing its (top) left singular vectors and the coherence is the largest leverage score. These quantities have been of interest in recently-popular problems such as matrix completion and Nyström-based low-rank matrix approximation; in large-scale statistical data analysis applications more generally; and since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary n × d matrix A, with n ≫ d, and that returns as output relative-error approximations to all n of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of n and d) in O(nd log n) time, as opposed to the O(nd) time required by the näıve algorithm that involves computing an orthogonal basis for the range of A. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson-Lindenstrauss type ideas. Several practically-important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with n ≈ d, and the extension to streaming environments.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Compressive Parameter Estimation with Earth Mover’s Distance via K-Median Clustering

In recent years, sparsity and compressive sensing have attracted significant attention in parameter estimation tasks, including frequency estimation, delay estimation, and localization. Parametric dictionaries collect observations for a sampling of the parameter space and can yield sparse representations for the signals of interest when the sampling is su ciently dense. While this dense samplin...

متن کامل

ALEVS: Active Learning by Statistical Leverage Sampling

Active learning aims to obtain a classifier of high accuracy by using fewer label requests in comparison to passive learning by selecting effective queries. Many active learning methods have been developed in the past two decades, which sample queries based on informativeness or representativeness of unlabeled data points. In this work, we explore a novel querying criterion based on statistical...

متن کامل

Fast Randomized Kernel Methods With Statistical Guarantees

One approach to improving the running time of kernel-based machine learning methods is to build a small sketch of the input and use it in lieu of the full kernel matrix in the machine learning task of interest. Here, we describe a version of this approach that comes with running time guarantees as well as improved guarantees on its statistical performance. By extending the notion of statistical...

متن کامل

A Novel Image Denoising Method Based on Incoherent Dictionary Learning and Domain Adaptation Technique

In this paper, a new method for image denoising based on incoherent dictionary learning and domain transfer technique is proposed. The idea of using sparse representation concept is one of the most interesting areas for researchers. The goal of sparse coding is to approximately model the input data as a weighted linear combination of a small number of basis vectors. Two characteristics should b...

متن کامل

Max-plus statistical leverage scores

The statistical leverage scores of a complex matrix A ∈ Cn×d record the degree of alignment between col(A) and the coordinate axes in Cn. These score are used in random sampling algorithms for solving certain numerical linear algebra problems. In this paper we present a max-plus algebraic analogue for statistical leverage scores. We show that max-plus statistical leverage scores can be used to ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 13  شماره 

صفحات  -

تاریخ انتشار 2012